Re-enabling high-speed caching for LSM-trees

Authors

  • Lei Guo
  • Dejun Teng
  • Rubao Lee
  • Feng Chen
  • Siyuan Ma
  • Xiaodong Zhang
Abstract

LSM-trees are widely used in cloud computing systems at Google, Facebook, and Amazon to achieve high performance for write-intensive workloads. However, in an LSM-tree, random key-value queries can experience long latency and low throughput because compaction, a basic operation of the algorithm, interferes with caching. An LSM-tree relies on frequent compaction operations to merge data into a sorted structure. After a compaction, the original data are reorganized and written to other locations on the disk. As a result, the cached data are invalidated because their referencing addresses change, causing serious performance degradation. We propose dLSM to re-enable high-speed caching during intensive writes. dLSM is an LSM-tree with an on-disk compaction buffer that works as a cushion to minimize the cache invalidation caused by compactions. The compaction buffer maintains a series of snapshots of the frequently compacted data, which represent a consistent view of the corresponding data in the underlying LSM-tree. Because they are updated at a much lower rate than that of compactions, data in the compaction buffer are almost stationary. In dLSM, an object is referenced by the disk address of the corresponding block, either in the compaction buffer for frequently compacted data or in the underlying LSM-tree for infrequently compacted data. Thus, hot objects can be kept in the cache without harmful invalidations. With the help of a small on-disk compaction buffer, dLSM achieves high query performance by enabling effective caching, while retaining all the merits of the LSM-tree for write-intensive data processing. We have implemented dLSM based on LevelDB. Our evaluations show that with a standard DRAM cache, dLSM can achieve a 5–8x performance improvement over an LSM-tree with the same cache on HDD storage.
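The core problem and remedy described in the abstract can be illustrated with a small toy model. This is not the authors' implementation: the classes `BlockCache` and `ToyStore` and all their methods are hypothetical names invented here. The sketch only shows why a cache keyed by disk block address loses its entries when compaction rewrites blocks to new addresses, and how pinning snapshots of hot blocks at stable addresses (the role of dLSM's compaction buffer) lets cache entries survive a compaction.

```python
# Toy sketch (hypothetical, not the paper's code): address-keyed caching
# under compaction, with and without a compaction-buffer-style pin set.

class BlockCache:
    """Cache keyed by on-disk block address, as in the paper's setting."""
    def __init__(self):
        self.entries = {}          # disk address -> cached block contents
        self.hits = 0
        self.misses = 0

    def get(self, addr):
        if addr in self.entries:
            self.hits += 1
            return self.entries[addr]
        self.misses += 1
        return None

    def put(self, addr, data):
        self.entries[addr] = data


class ToyStore:
    """Keys live in blocks; an index maps each key to its current address."""
    def __init__(self, use_compaction_buffer):
        self.use_buffer = use_compaction_buffer
        self.next_addr = 0
        self.blocks = {}           # address -> block data ("the disk")
        self.index = {}            # key -> current block address
        self.buffered = set()      # addresses pinned in the compaction buffer
        self.cache = BlockCache()

    def _alloc(self, data):
        addr, self.next_addr = self.next_addr, self.next_addr + 1
        self.blocks[addr] = data
        return addr

    def put(self, key, value):
        self.index[key] = self._alloc(value)

    def read(self, key):
        addr = self.index[key]
        data = self.cache.get(addr)
        if data is None:           # cache miss: fetch from "disk" and cache
            data = self.blocks[addr]
            self.cache.put(addr, data)
        return data

    def mark_hot(self, key):
        """dLSM-style: snapshot a frequently compacted block into the buffer."""
        if self.use_buffer:
            self.buffered.add(self.index[key])

    def compact(self):
        """Rewrite every non-buffered block to a fresh disk address."""
        for key, addr in list(self.index.items()):
            if addr in self.buffered:
                continue           # buffered snapshot keeps a stable address
            self.index[key] = self._alloc(self.blocks.pop(addr))
```

Driving both variants the same way (load keys, read once to warm the cache, mark them hot, compact, read again) shows the difference: without the buffer, every post-compaction read misses because the addresses changed; with the buffer, the warmed entries remain valid.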


Related articles

Revisiting Route Caching: The World Should Be Flat

Internet routers’ forwarding tables (FIBs), which must be stored in expensive fast memory for high-speed packet forwarding, are growing quickly in size due to increased multihoming, finer-grained traffic engineering, and deployment of IPv6 and VPNs. To address this problem, several Internet architectures have been proposed to reduce FIB size by returning to the earlier approach of route caching...


Storage Management in AsterixDB

Social networks, online communities, mobile devices, and instant messaging applications generate complex, unstructured data at a high rate, resulting in large volumes of data. This poses new challenges for data management systems that aim to ingest, store, index, and analyze such data efficiently. In response, we released the first public version of AsterixDB, an open-source Big Data Management...


Payload Caching: High-Speed Data Forwarding for Network Intermediaries

Large-scale network services such as data delivery often incorporate new functions by interposing intermediaries on the network. Examples of forwarding intermediaries include firewalls, content routers, protocol converters, caching proxies, and multicast servers. With the move toward network storage, even static Web servers act as intermediaries to forward data from storage to clients. This paper...


Design and Improvement of the Thrust Force Characteristic of an LSM for the propulsion/levitation of the 700km/h-speed tube train

In case of high-speed maglev trains, the propulsion force of their Linear Synchronous Motor (LSM) is an essential performance element since the LSM is responsible for both the propulsion and levitation of the train. A large thrust force ripple causes vibration, noise and severe levitation disturbance during the train operation. Because of these reasons, efforts must be made to reduce the thrust...


Liquid State Machine Learning for Resource and Cache Management in LTE-U Unmanned Aerial Vehicle (UAV) Networks

In this paper, the problem of joint caching and resource allocation is investigated for a network of cache-enabled unmanned aerial vehicles (UAVs) that service wireless ground users over the LTE licensed and unlicensed (LTE-U) bands. The considered model focuses on users that can access both licensed and unlicensed bands while receiving contents from either the cache units at the UAVs directly ...



Journal:
  • CoRR

Volume: abs/1606.02015  Issue: -

Pages: -

Publication year: 2016